routing: gracefully handle missing node IP during BGP announcements #871
Problem
In dual-stack or IPv6-enabled clusters, the agent can crash when it attempts to announce or withdraw a BGP path for an IPv6 address but the node has no corresponding IPv6 address configured in `HostMetadata`.
RCA
1. `common.MakePath()` is called to construct a BGP path for announcement/withdrawal.
2. When the node IP for that address family is missing (e.g. `nodeIPv6 == nil`), `MakePath()` returns an error.
3. The error propagates through `routing_server.announceLocalAddress()` or `prefix_watcher.WatchPrefix()`.
4. The `.Go()` wrapper passes the error to tomb, which enters the `Dying` state, signaling all goroutines to stop.
Impact
Solution
Implement graceful degradation by treating missing node IP as a non-fatal, recoverable condition.
A missing node IP is a transient or configuration condition; it does not mean the routing component is broken. Returning an error would stop the routing/prefix watchers and cause the tomb to kill the whole agent.
By returning early, the routing/prefix watchers keep running, and missing paths can be restored later. The logic keeps state in memory (`localAddressMap`), so once the node IP becomes available, `RestoreLocalAddresses()` will re-announce.